10 research outputs found
High Accuracy Determination of Rheological Properties of Drilling Fluids Using the Marsh Funnel
Efficient and safe drilling operations require precise determination of
rheological properties in drilling fluids, encompassing dynamic viscosity for
Newtonian fluids, and apparent viscosity, plastic viscosity, and yield point
for non-Newtonian fluids. Conventional viscometers like vibrating wire, ZNN-D6,
and Fann-35 offer high accuracy but are limited by cost and complexity in
small-scale industries and labs. To address this, our research presents a novel
mathematical model based on the Herschel-Bulkley model, aiming to accurately
characterise drilling fluids' rheological properties using the Marsh funnel as
an alternative device that is economical, operator-friendly, and
power-independent. Drawing inspiration from seminal works by Li et
al. (2020), Sedaghat (2017), and Guria et al. (2013), this innovative framework
establishes a universal inverse linear relationship between a fluid's flow
factor and final discharge time. For any fluid, it utilises its density and
flow factor (or final discharge time) to determine all its rheological
properties. Specifically, it evaluates dynamic viscosity for Newtonian fluids,
apparent viscosity, plastic viscosity, and yield point for weighted
non-Newtonian fluids, and apparent viscosity for non-weighted non-Newtonian
fluids, with average systematic errors (against Fann-35 measurements) of 0.39%,
3.52%, 2.17%, 18.38%, and 5.84%, respectively, surpassing the precision of
alternative mathematical models found in the aforementioned literature.
Furthermore, while our framework's precision in plastic viscosity and yield
point assessment of non-weighted non-Newtonian fluids slightly lags behind the
framework of Li et al. (2020), it outperforms the model of Sedaghat (2017). In
conclusion, despite minor limitations, our proposed mathematical model holds
huge promise for drilling fluid rheology in petroleum, drilling, and related
industries.
Comment: 57 pages, 1 figure, and 10 tables. Funding for this research work was
provided through the IIChE Research Grant for the academic year 2022-23,
granted by the Indian Institute of Chemical Engineers (IIChE).
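The abstract above builds on the Herschel-Bulkley model. As a minimal sketch of that standard constitutive law only (not the authors' Marsh funnel model, whose flow-factor relation is not given in the abstract), the shear stress is tau = tau_y + K * gamma_dot**n. The function names and the parameter values in the usage lines are hypothetical and chosen purely for illustration.

```python
def herschel_bulkley_stress(shear_rate, yield_stress, consistency, flow_index):
    """Shear stress (Pa) from the Herschel-Bulkley law.

    tau = tau_y + K * gamma_dot ** n
    shear_rate   : gamma_dot, in 1/s (must be non-negative)
    yield_stress : tau_y, in Pa
    consistency  : K, in Pa.s^n
    flow_index   : n, dimensionless (n = 1 with tau_y = 0 is a Newtonian fluid)
    """
    if shear_rate < 0:
        raise ValueError("shear_rate must be non-negative")
    return yield_stress + consistency * shear_rate ** flow_index


def apparent_viscosity(shear_rate, yield_stress, consistency, flow_index):
    """Apparent viscosity (Pa.s): shear stress divided by shear rate."""
    if shear_rate <= 0:
        raise ValueError("shear_rate must be positive")
    stress = herschel_bulkley_stress(shear_rate, yield_stress, consistency, flow_index)
    return stress / shear_rate


# Illustrative (made-up) parameters: a Newtonian fluid and a Bingham-like fluid.
newtonian_mu = apparent_viscosity(100.0, 0.0, 0.001, 1.0)
bingham_tau = herschel_bulkley_stress(100.0, 5.0, 0.02, 1.0)
```

With tau_y = 0 and n = 1 the law reduces to a Newtonian fluid whose apparent viscosity equals K at every shear rate, which is why a single model can cover both fluid classes mentioned in the abstract.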
Diving into the Depths of Spotting Text in Multi-Domain Noisy Scenes
When used in a real-world noisy environment, the capacity to generalize to
multiple domains is essential for any autonomous scene text spotting system.
However, existing state-of-the-art methods employ pretraining and fine-tuning
strategies on natural scene datasets, which do not exploit the feature
interaction across other complex domains. In this work, we explore and
investigate the problem of domain-agnostic scene text spotting, i.e., training
a model on multi-domain source data such that it can directly generalize to
target domains rather than being specialized for a specific domain or scenario.
In this regard, we present to the community a text spotting validation benchmark
called Under-Water Text (UWT) for noisy underwater scenes to establish an
important case study. Moreover, we also design an efficient super-resolution
based end-to-end transformer baseline called DA-TextSpotter which achieves
comparable or superior performance over existing text spotting architectures
for both regular and arbitrary-shaped scene text spotting benchmarks in terms
of both accuracy and model efficiency. The dataset, code and pre-trained models
will be released upon acceptance.
Comment: 10 image
SwinDocSegmenter: An End-to-End Unified Domain Adaptive Transformer for Document Instance Segmentation
Instance-level segmentation of documents consists in assigning a class-aware
and instance-aware label to each pixel of the image. It is a key step in
document parsing and understanding. In this paper, we present a unified
transformer encoder-decoder architecture for end-to-end instance segmentation of
complex layouts in document images. The method adapts contrastive training
with mixed query selection for anchor initialization in the decoder. Later
on, it performs a dot product between the obtained query embeddings and the
pixel embedding map (coming from the encoder) for semantic reasoning. Extensive
experimentation on competitive benchmarks like PubLayNet, PRIMA, Historical
Japanese (HJ), and TableBank demonstrates that our model with SwinL backbone
achieves better segmentation performance than the existing state-of-the-art
approaches, with average precision of 93.72, 54.39, 84.65, and 98.04,
respectively, with under one billion parameters.
The code is made publicly available at:
https://github.com/ayanban011/SwinDocSegmenter
Comment: Accepted to ICDAR 2023 (San Jose, California)
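The dot product between query embeddings and the pixel embedding map described in the abstract above can be sketched as follows. This is an illustrative toy in pure Python under stated assumptions, not the SwinDocSegmenter code: the function names, the nested-list layout (`pixel_map[h][w]` holding one embedding vector per pixel), and the argmax assignment step are all hypothetical. A real implementation would use a tensor library.

```python
def query_pixel_scores(queries, pixel_map):
    """Return scores[q][h][w] = dot(queries[q], pixel_map[h][w]).

    queries   : list of Q embedding vectors, each of length C
    pixel_map : H x W grid where each cell is a length-C embedding vector
    """
    scores = []
    for query in queries:
        plane = []
        for row in pixel_map:
            plane.append([sum(q * p for q, p in zip(query, pixel)) for pixel in row])
        scores.append(plane)
    return scores


def assign_pixels(scores):
    """Label each pixel with the index of its highest-scoring query (assumed step)."""
    num_queries = len(scores)
    height, width = len(scores[0]), len(scores[0][0])
    return [
        [max(range(num_queries), key=lambda q: scores[q][h][w]) for w in range(width)]
        for h in range(height)
    ]


# Toy example: 2 queries, a 1 x 2 pixel map, embedding dimension 2.
queries = [[1.0, 0.0], [0.0, 1.0]]
pixel_map = [[[2.0, 1.0], [0.5, 3.0]]]
scores = query_pixel_scores(queries, pixel_map)
labels = assign_pixels(scores)
```

Each query yields one score plane over the image, so pixels whose embeddings align with a query receive high scores for that query.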
Beyond Document Page Classification: Design, Datasets, and Challenges
This paper highlights the need to bring document classification benchmarking
closer to real-world applications, both in the nature of data tested
(multi-channel, multi-paged, multi-industry; class distributions and label
set variety) and in classification tasks considered (multi-page document,
page stream, and document bundle classification, ...). We identify the lack of
public multi-page document classification datasets, formalize different
classification tasks arising in application scenarios, and motivate the value
of targeting efficient multi-page document representations. An experimental
study on proposed multi-page document classification datasets demonstrates that
current benchmarks have become irrelevant and need to be updated to evaluate
complete documents, as they naturally occur in practice. This reality check
also calls for more mature evaluation methodologies, covering calibration
evaluation, inference complexity (time-memory), and a range of realistic
distribution shifts (e.g., born-digital vs. scanning noise, shifting page
order). Our study ends on a hopeful note by recommending concrete avenues for
future improvements.
Comment: 8 pages, under review
SelfDocSeg: A Self-Supervised vision-based Approach towards Document Segmentation
Document layout analysis is a well-known problem in the document research
community and has been extensively explored, yielding a multitude of solutions
ranging from text mining and recognition to graph-based representation, visual
feature extraction, etc. However, most existing works have ignored the
crucial fact of the scarcity of labeled data. With growing internet
connectivity in personal life, an enormous number of documents has become
available in the public domain, making data annotation a tedious task.
We address this challenge using self-supervision and, unlike the few existing
self-supervised document segmentation approaches which use text mining and
textual labels, we use a complete vision-based approach in pre-training without
any ground-truth label or its derivative. Instead, we generate pseudo-layouts
from the document images to pre-train an image encoder to learn the document
object representation and localization in a self-supervised framework before
fine-tuning it with an object detection model. We show that our pipeline sets a
new benchmark in this context and performs on par with, if not better than,
existing methods and their supervised counterparts. The code is made publicly
available at: https://github.com/MaitySubhajit/SelfDocSeg
Comment: Accepted at The 17th International Conference on Document Analysis
and Recognition (ICDAR 2023)
Text-DIAE: A Self-Supervised Degradation Invariant Autoencoder for Text Recognition and Document Enhancement
In this paper, we propose a Text-Degradation Invariant Auto Encoder (Text-DIAE), a self-supervised model designed to tackle two tasks, text recognition (handwritten or scene-text) and document image enhancement. We start by employing a transformer-based architecture that incorporates three pretext tasks as learning objectives to be optimized during pre-training without the usage of labelled data. Each of the pretext objectives is specifically tailored for the final downstream tasks. We conduct several ablation experiments that confirm the design choice of the selected pretext tasks. Importantly, the proposed model does not exhibit limitations of previous state-of-the-art methods based on contrastive losses, while at the same time requiring substantially fewer data samples to converge. Finally, we demonstrate that our method surpasses the state-of-the-art in existing supervised and self-supervised settings in handwritten and scene text recognition and document image enhancement. Our code and trained models will be made publicly available at https://github.com/dali92002/SSL-OC